llm: register sdpa variant #3802

lanluo-nvidia · 2025-08-29T23:11:55Z

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

peri044

Minor comments. Please update the supported models section in README.md and docs/user_guide.

tools/llm/torchtrt_ext/register_sdpa.py

tools/llm/torchtrt_ext/sdpa_converter.py

tools/llm/run_llm.py

peri044

Could you quickly try running these variants as well?
a) google/gemma-3-4b-it
b) google/gemma-3-270m-it
Please update the supported model list here as well: https://github.com/pytorch/TensorRT/blob/main/docsrc/tutorials/compile_hf_models.rst
Could you add a test case for one Gemma-3 decoder layer with sliding window attention? The test case could be located at https://github.com/pytorch/TensorRT/tree/main/tests/py/dynamo/models as test_llm_models.py
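
For readers unfamiliar with the term, the sliding window attention pattern such a test would exercise can be sketched as below. This is a minimal illustration only, not code from this PR: the helper name `sliding_window_causal_mask` is hypothetical, and the window convention (query `i` sees keys `j` with `0 <= i - j < window`) is an assumption that may differ by one from what a given model implementation uses.

```python
from typing import List


def sliding_window_causal_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Hypothetical helper: mask[i][j] is True where query i may attend to key j.

    Assumed convention: attention is causal (j <= i) and limited to the
    most recent `window` positions, i.e. 0 <= i - j < window.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_causal_mask(seq_len=5, window=3)
for row in mask:
    # "x" marks an allowed (query, key) pair, "." a masked one.
    print("".join("x" if allowed else "." for allowed in row))
```

When the window is at least as long as the sequence, the mask reduces to an ordinary causal (lower-triangular) mask, which makes a convenient sanity check in a test.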

peri044

LGTM

docsrc/tutorials/compile_hf_models.rst

register sdpa variant

2d5ffb5

lanluo-nvidia requested a review from peri044 August 29, 2025 23:11

meta-cla bot added the cla signed label Aug 29, 2025

github-actions bot added the component: lowering, component: api [Python], and component: dynamo labels Aug 29, 2025

github-actions bot requested a review from gs-olive August 29, 2025 23:12

test

7490b02

lanluo-nvidia marked this pull request as ready for review August 29, 2025 23:16

peri044 reviewed Aug 30, 2025

tools/llm/torchtrt_ext/register_sdpa.py (outdated)

tools/llm/torchtrt_ext/register_sdpa.py

tools/llm/torchtrt_ext/sdpa_converter.py

tools/llm/run_llm.py

lanluo-nvidia added 2 commits September 2, 2025 11:08

resolve comments

d8e1ae0

Merge branch 'main' into lluo/register_sdpa_variant

89103e8

peri044 reviewed Sep 2, 2025

add llm test cases

0653f36

github-actions bot added the documentation and component: tests labels Sep 3, 2025

peri044 approved these changes Sep 3, 2025

docsrc/tutorials/compile_hf_models.rst

lanluo-nvidia merged commit 70874be into main Sep 3, 2025
87 of 100 checks passed
